Reading metrics for estimating task efficiency with MT output

Authors

  • Sigrid Klerke
  • Sheila Castilho
  • Maria Barrett
  • Anders Søgaard
Abstract

We show that metrics derived from recording gaze while reading are better proxies for machine translation quality than automated metrics. With reliable eye-tracking technologies becoming available for home computers and mobile devices, such metrics are readily available even in the absence of representative held-out human translations. In other words, reading-derived MT metrics offer a way of getting cheap, online feedback for MT system adaptation.

Similar resources

Methods for human evaluation of machine translation

Evaluation of machine translation (MT) is a difficult task, both for humans, and using automatic metrics. The main difficulty lies in the fact that there is not one single correct translation, but many alternative good translation options. MT systems are often evaluated using automatic metrics, which commonly rely on comparing a translation to only a single human reference translation. An alter...

Sensitivity of Automated MT Evaluation Metrics on Higher Quality MT Output: BLEU vs Task-Based Evaluation Methods

We report the results of an experiment to assess the ability of automated MT evaluation metrics to remain sensitive to variations in MT quality as the average quality of the compared systems goes up. We compare two groups of metrics: those which measure the proximity of MT output to some reference translation, and those which evaluate the performance of some automated process on degraded MT out...

Ten Years of WMT Evaluation Campaigns: Lessons Learnt

The WMT evaluation campaign (http://www.statmt.org/wmt16) has been run annually since 2006. It is a collection of shared tasks related to machine translation, in which researchers compare their techniques against those of others in the field. The longest running task in the campaign is the translation task, where participants translate a common test set with their MT systems. In addition to the...

L2 Vocabulary Learning and the Use of Reading Tasks: Manipulating the Involvement Load Index

As Schmidt (2008) states, deeper engagement with new vocabulary as induced by tasks clearly increases the chances of learning those words. This engagement is theoretically clarified by the involvement load hypothesis (ILH, Laufer and Hulstijn, 2001), based on which the involvement index of each task can be measured. The present study was designed to test ILH by evaluating the impact of 4 differ...

Publication date: 2015